Seed specific transcriptional regulation.

Nucleic acid sequences and methods for their use are
provided which provide for seed specific transcription, in order
to modulate or modify expression in seed, particularly embryo
cells. Transcriptional initiation regions are identified and
isolated from plant cells and used to prepare expression
cassettes which may then be transformed into plant cells for
seed specific transcription. The method finds particular use in
conjunction with modifying fatty acid production in seed tissue.

Description

SEED SPECIFIC TRANSCRIPTIONAL REGULATION 



INTRODUCTION

Technical Field

Genetic modification of plant material is provided for seed specific transcription. Production of endogenous 
products may be modulated or new capabilities provided. 

Background

The primary emphasis in genetic modification has been directed to prokaryotes and mammalian cells. For a
variety of reasons plants have proven more intransigent to genetic manipulation than other eukaryotic cells.
In part, this has been the result of the different goals involved, since for the most part
plant modification has been directed to modifying the entire plant or a particular plant part in a live plant, as
distinct from modifying cells in culture.

For many applications, it will be desirable to provide for transcription in a particular plant part or at a
particular time in the growth cycle of the plant. Toward this end, there is a substantial interest in identifying
endogenous plant products whose transcription or expression is regulated in a manner of interest. In
identifying such products, one must first look for products which appear at a particular time in the cell growth
cycle or in a particular plant part, demonstrate their absence at other times or in other parts, identify nucleic acid
sequences associated with the product and then identify the sequence in the genome of the plant in order to
obtain the 5'-untranslated sequence associated with transcription. This requires substantial investigation in
first identifying the particular sequence, followed by establishing that it is the correct sequence and isolating
the desired transcriptional regulatory region. One must then prepare appropriate constructs, followed by
demonstration that the constructs are efficacious in the desired manner.

Identifying such sequences is a challenging project, subject to substantial pitfalls and uncertainty. There is,
however, substantial interest in being able to genetically modify plants, which justifies the substantial
expenditures and efforts in identifying transcriptional sequences and manipulating them to determine their
utility.

Relevant Literature 

Crouch et al., In: Molecular Form and Function of the Plant Genome, eds van Vloten-Doting, Groot and Hall,
Plenum Publishing Corp. 1985, pp 555-566; Crouch and Sussex, Planta (1981) 153:64-74; Crouch et al., J. Mol.
Appl. Genet. (1983) 2:273-283; and Simon et al., Plant Molecular Biology (1985) 5:191-201, describe various
aspects of Brassica napus storage proteins. Beachy et al., EMBO J. (1985) 4:3047-3053; Sengupta-Gopalan et
al., Proc. Natl. Acad. Sci. USA (1985) 82:3320-3324; Greenwood and Chrispeels, Plant Physiol. (1985) 79:65-71
and Chen et al., Proc. Natl. Acad. Sci. USA (1986) 83:8560-8564 describe studies concerned with seed storage
proteins and genetic manipulation. Eckes et al., Mol. Gen. Genet. (1986) 205:14-22 and Fluhr et al., Science
(1986) 232:1106-1112 describe the genetic manipulation of light inducible plant genes.

SUMMARY OF THE INVENTION 

DNA constructs are provided which are employed in manipulating plant cells to provide for seed-specific
transcription. Particularly, storage protein transcriptional regions are joined to other than the wild-type gene
and introduced into plant genomes to provide for seed-specific transcription. The constructs provide for
modulation of endogenous products as well as production of heterologous products.

DESCRIPTION OF THE SPECIFIC EMBODIMENTS 

Novel DNA constructs are provided which allow for modification of transcription in seed, particularly in
embryos during seed maturation. The DNA constructs comprise a regulated transcriptional initiation region
associated with seed formation, preferably in association with embryogenesis and seed maturation. Of
particular interest are those transcriptional initiation regions associated with storage proteins, such as napin,
cruciferin, β-conglycinin, phaseolin, or the like. The transcriptional initiation regions may be obtained from any
convenient host, particularly plant hosts such as Brassica, e.g. napus or campestris, soybean (Glycine max),
bean (Phaseolus vulgaris), corn (Zea mays), cotton (Gossypium sp.), safflower (Carthamus tinctorius), tomato
(Lycopersicon esculentum), and Cuphea species.

Downstream from and under the transcriptional initiation regulation of the seed specific region will be a
sequence of interest which will provide for modification of the phenotype of the seed, by modulating the
production of an endogenous product, as to amount, relative distribution, or the like, or production of a
heterologous expression product to provide for a novel function or product in the seed. The DNA construct will
also provide for a termination region, so as to provide an expression cassette into which a gene may be
introduced. Conveniently, transcriptional initiation and termination regions may be provided separated in the
direction of transcription by a linker or polylinker having one or a plurality of restriction sites for insertion of the
gene to be under the transcriptional regulation of the regulatory regions. Usually, the linker will have from 1 to
10, more usually from about 1 to 8, preferably from about 2 to 6 restriction sites. Generally, the linker will be
fewer than 100 bp, frequently fewer than 60 bp and generally at least about 5 bp.
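
The linker constraints above can be illustrated with a minimal Python sketch. The recognition sequences are the standard ones for the named enzymes; the `LINKER` sequence, the function names, and the choice of enzymes are assumptions made up for illustration and are not taken from the constructs described here.

```python
# Recognition sequences (5'->3') for a few common restriction enzymes.
# All six are palindromic, so scanning one strand is sufficient.
SITES = {
    "HindIII": "AAGCTT",
    "PstI": "CTGCAG",
    "SalI": "GTCGAC",
    "BamHI": "GGATCC",
    "SmaI": "CCCGGG",
    "EcoRI": "GAATTC",
}


def find_sites(seq):
    """Return {enzyme: [0-based start positions]} for sites present in seq."""
    seq = seq.upper()
    hits = {}
    for name, site in SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
        if positions:
            hits[name] = positions
    return hits


def linker_ok(seq, min_bp=5, max_bp=100, min_sites=1, max_sites=10):
    """Check the 'usual' limits: 1 to 10 sites, at least 5 bp, fewer than 100 bp."""
    n_sites = sum(len(p) for p in find_sites(seq).values())
    return min_bp <= len(seq) < max_bp and min_sites <= n_sites <= max_sites


# Hypothetical polylinker, invented for this example only.
LINKER = "AAGCTTCTGCAGGTCGACGGATCCCCGGGAATTC"

if __name__ == "__main__":
    print(len(LINKER), find_sites(LINKER), linker_ok(LINKER))
```

Overlapping sites (here BamHI, SmaI and EcoRI) are counted individually, which is one reason a short linker can still carry several cloning sites.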

The transcriptional initiation region may be native or homologous to the host or foreign or heterologous to
the host. By foreign is intended that the transcriptional initiation region is not found in the wild-type host into
which the transcriptional initiation region is introduced.

Transcriptional initiation regions of particular interest are those associated with the Brassica napus or
campestris napin genes, acyl carrier proteins, genes that express from about day 7 to day 40 in seed,
particularly having maximum expression from about day 10 to about day 20, where the expressed gene is not
found in leaves, while the expressed product is found in seed in high abundance.

The transcriptional cassette will include in the 5'-3' direction of transcription, a transcriptional and
translational initiation region, a sequence of interest, and a transcriptional and translational termination region
functional in plants. One or more introns may also be present. The DNA sequence may have any open reading
frame encoding a peptide of interest, e.g. an enzyme, or a sequence complementary to a genomic sequence,
where the genomic sequence may be an open reading frame, an intron, a non-coding leader sequence, or any
other sequence where the complementary sequence will inhibit transcription, messenger RNA processing,
e.g. splicing, or translation. The DNA sequence of interest may be synthetic, naturally derived, or combinations
thereof. Depending upon the nature of the DNA sequence of interest, it may be desirable to synthesize the
sequence with plant preferred codons. The plant preferred codons may be determined from the codons of
highest frequency in the proteins expressed in the largest amount in the particular plant species of interest.
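
Determining preferred codons from highly expressed genes can be sketched as a simple frequency count. The following Python is an illustration only: it assumes the standard genetic code, assumes the input coding sequences are in frame and come from abundantly expressed proteins, and uses a made-up toy sequence (`TOY_CDS`) rather than any real plant gene.

```python
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional TCAG-ordered string.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def preferred_codons(coding_seqs):
    """Return {amino acid: most frequent codon} over in-frame coding sequences."""
    counts = defaultdict(Counter)
    for seq in coding_seqs:
        seq = seq.upper()
        usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = seq[i:i + 3]
            aa = CODON_TABLE.get(codon)  # None for codons with ambiguous bases
            if aa and aa != "*":         # stop codons are not counted
                counts[aa][codon] += 1
    return {aa: c.most_common(1)[0][0] for aa, c in counts.items()}


# Toy sequence invented for illustration: ATG GCT GCT GCC AAG AAG AAA TAA
TOY_CDS = "ATGGCTGCTGCCAAGAAGAAATAA"

if __name__ == "__main__":
    print(preferred_codons([TOY_CDS]))
```

In practice the counts would be weighted toward the most abundant proteins of the species of interest, as the text describes; this sketch treats every supplied sequence equally.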

In preparing the transcription cassette, the various DNA fragments may be manipulated, so as to provide for
the DNA sequences in the proper orientation and, as appropriate, in the proper reading frame. Toward this
end, adapters or linkers may be employed for joining the DNA fragments or other manipulations may be
involved to provide for convenient restriction sites, removal of superfluous DNA, removal of restriction sites, or
the like. For this purpose, in vitro mutagenesis, primer repair, restriction, annealing, resection, ligation, or the
like may be employed, where insertions, deletions or substitutions, e.g. transitions and transversions, may be
involved.

The termination region which is employed will be primarily one of convenience, since the termination regions
appear to be relatively interchangeable. The termination region may be native with the transcriptional initiation
region, may be native with the DNA sequence of interest, or may be derived from another source. Convenient
termination regions are available from the Ti-plasmid of A. tumefaciens, such as the octopine synthase and
nopaline synthase termination regions.

By appropriate manipulations, such as restriction, chewing back or filling in overhangs to provide blunt
ends, ligation of linkers, or the like, complementary ends of the fragments can be provided for joining and
ligation.

In carrying out the various steps, cloning is employed, so as to amplify the amount of DNA and to allow for
analyzing the DNA to ensure that the operations have occurred in a proper manner. A wide variety of cloning
vectors are available, where the cloning vector includes a replication system functional in E. coli and a marker
which allows for selection of the transformed cells. Illustrative vectors include pBR332, pUC series, M13mp
series, pACYC184, etc. Thus, the sequence may be inserted into the vector at an appropriate restriction
site(s), the resulting plasmid used to transform the E. coli host, the E. coli grown in an appropriate nutrient
medium and the cells harvested and lysed and the plasmid recovered. Analysis may involve sequence analysis,
restriction analysis, electrophoresis, or the like. After each manipulation the DNA sequence to be used in the
final construct may be restricted and joined to the next sequence, where each of the partial constructs may be
cloned in the same or different plasmids.

In addition to the transcription construct, depending upon the manner of introduction of the transcription
construct into the plant, other DNA sequences may be required. For example, when using the Ti- or Ri-plasmid
for transformation of plant cells, as described below, at least the right border and frequently both the right and
left borders of the T-DNA of the Ti- and Ri-plasmids will be joined as flanking regions to the transcription
construct. The use of T-DNA for transformation of plant cells has received extensive study and is amply
described in EPA Serial No. 120,516, Hoekema, In: The Binary Plant Vector System, Offset-drukkerij Kanters
B.V., Alblasserdam, 1985, Chapter V, Fraley, et al., Crit. Rev. Plant Sci., 4:1-46, and An et al., EMBO J. (1985)
4:277-284.

Alternatively, to enhance integration into the plant genome, terminal repeats of transposons may be used as
borders in conjunction with a transposase. In this situation, expression of the transposase should be
inducible, or the transposase inactivated, so that once the transcription construct is integrated into the
genome, it should be relatively stably integrated and avoid hopping.

The transcription construct will normally be joined to a marker for selection in plant cells. Conveniently, the
marker may be resistance to a biocide, particularly an antibiotic, such as kanamycin, G418, bleomycin,
hygromycin, chloramphenicol, or the like. The particular marker employed will be one which will allow for
selection of transformed cells as compared to cells lacking the DNA which has been introduced.

A variety of techniques are available for the introduction of DNA into a plant cell host. These techniques
include transformation with Ti-DNA employing A. tumefaciens or A. rhizogenes as the transforming agent,
protoplast fusion, injection, electroporation, etc. For transformation with Agrobacterium, plasmids can be
prepared in E. coli which plasmids contain DNA homologous with the Ti-plasmid, particularly T-DNA. The
plasmid may or may not be capable of replication in Agrobacterium, that is, it may or may not have a broad
spectrum prokaryotic replication system, e.g. RK290, depending in part upon whether the transcription
construct is to be integrated into the Ti-plasmid or be retained on an independent plasmid. By means of a
helper plasmid, the transcription construct may be transferred to the A. tumefaciens and the resulting
transformed organism used for transforming plant cells.

Conveniently, explants may be cultivated with the A. tumefaciens or A. rhizogenes to allow for transfer of the
transcription construct to the plant cells, the plant cells dispersed in an appropriate selective medium for
selection, grown to callus, shoots grown and plantlets regenerated from the shoots by growing in rooting
medium. The Agrobacterium host will contain a plasmid having the vir genes necessary for transfer of the
T-DNA to the plant cells and may or may not have T-DNA. For injection and electroporation, disarmed
Ti-plasmids (lacking the tumor genes, particularly the T-DNA region) may be introduced into the plant cell.

The constructs may be used in a variety of ways. Particularly, the constructs may be used to modify the fatty
acid composition in seeds, that is changing the ratio and/or amounts of the various fatty acids, as to length,
unsaturation, or the like. Thus, the fatty acid composition may be varied, enhancing the fatty acids of from 10 to
14 carbon atoms as compared to the fatty acids of from 16 to 18 carbon atoms, increasing or decreasing fatty
acids of from 20 to 24 carbon atoms, providing for an enhanced proportion of fatty acids which are saturated or
unsaturated, or the like. These results can be achieved by providing for reduction of expression of one or more
endogenous products, particularly enzymes or cofactors, by producing a transcription product which is
complementary to the transcription product of a native gene, so as to inhibit the maturation and/or expression
of the transcription product, or providing for expression of a gene, either endogenous or exogenous,
associated with fatty acid synthesis. Expression products associated with fatty acid synthesis include acyl
carrier protein, thioesterase, acetyl transacylase, acetyl-CoA carboxylase, ketoacyl-synthases, malonyl
transacylase, stearoyl-ACP desaturase, and other desaturase enzymes.

Alternatively, one may wish to provide various products from other sources including mammals, such as
blood factors, lymphokines, colony stimulating factors, interferons, plasminogen activators, enzymes, e.g.
superoxide dismutase, chymosin, etc., hormones, rat mammary thioesterase 2, phospholipid acyl desaturases
involved in the synthesis of eicosapentaenoic acid, human serum albumin. Another purpose is to increase the
level of seed proteins, particularly mutated seed proteins, having an improved amino acid distribution which
would be better suited to the nutrient value of the seed. In this situation, one might provide for inhibition of the
native seed protein by producing a complementary DNA sequence to the native coding region or non-coding
region, where the complementary sequence would not efficiently hybridize to the mutated sequence, or
inactivate the native transcriptional capability.

The cells which have been transformed may be grown into plants in accordance with conventional ways.
See, for example, McCormick et al., Plant Cell Reports (1986) 5:81-84. These plants may then be grown, and
either pollinated with the same transformed strain or different strains, identifying the resulting hybrid having
the desired phenotypic characteristic. Two or more generations may be grown to ensure that the subject
phenotypic characteristic is stably maintained and inherited and then seeds harvested to ensure the desired
phenotype or other property has been achieved.

As a host cell, any plant variety may be employed which provides a seed of interest. Thus, for the most part,
plants will be chosen where the seed is produced in high amounts or a seed specific product of interest is
involved. Seeds of interest include the oil seeds, such as the Brassica seeds, cotton seeds, soybean,
safflower, sunflower, or the like; grain seeds, e.g. wheat, barley, rice, clover, corn, or the like.

Identifying useful transcriptional initiation regions may be achieved in a number of ways. Where the seed
protein has been or is isolated, it may be partially sequenced, so that a probe may be designed for identifying
messenger RNA specific for seed. To further enhance the concentration of the messenger RNA specifically
associated with seed, cDNA may be prepared and the cDNA subtracted with messenger RNA or cDNA from
non-seed associated cells. The residual cDNA may then be used for probing the genome for complementary
sequences, using an appropriate library prepared from plant cells. Sequences which hybridize to the cDNA
may then be isolated, manipulated, and the 5'-untranslated region associated with the coding region isolated
and used in expression constructs to identify the transcriptional activity of the 5'-untranslated region.

In some instances, the research effort may be further shortened by employing a probe directly for screening
a genomic library and identifying sequences which hybridize to the probe. The sequences will be manipulated
as described above to identify the 5'-untranslated region.

The expression constructs which are prepared employing the 5'-untranslated regions may be transformed
into plant cells as described previously for determination of their ability to function with a heterologous
structural gene (other than the wild-type open reading frame associated with the 5'-untranslated region) and
the seed specificity. In this manner, specific sequences may be identified for use with sequences for seed
specific transcription. Expression cassettes of particular interest include transcriptional initiation regions from
napin genes, particularly Brassica napin genes, more particularly Brassica napus or Brassica campestris
genes, regulating structural genes associated with lipid production, particularly fatty acid production, including
acyl carrier proteins, which may be endogenous or exogenous to the particular plant, such as spinach acyl
carrier protein, Brassica acyl carrier protein, either napus or campestris, Cuphea acyl
carrier protein, acetyl transacylase, malonyl transacylase, β-ketoacyl synthases I and II, thioesterase,
particularly thioesterase II, from plant, mammalian, or bacterial sources, for example rat thioesterase II, acyl
ACP, or phospholipid acyl desaturases.

The following examples are offered by way of illustration and not by way of limitation.

EXPERIMENTAL
Materials and Methods 

Cloning Vectors 

Cloning vectors used include the pUC vectors, pUC8 and pUC9 (Vieira and Messing, Gene (1982)
19:259-268); pUC18 and pUC19 (Norrander et al., Gene (1983) 26:101-106; Yanisch-Perron et al., Gene (1985)
33:103-119), and analogous vectors exchanging chloramphenicol resistance (CAM) as a marker for the
ampicillin resistance of the pUC plasmids described above (pUC-CAM [pUC12-Cm, pUC13-Cm] Buckley, D.,
Ph.D. Thesis, U.C.S.D., CA 1985). The multiple cloning sites of pUC18 and pUC19 vectors were exchanged with
those of pUC-CAM to create pCGN565 and pCGN566 which are CAM resistant. Also used were pUC118 and
pUC119, which are respectively, pUC18 and pUC19 with the intergenic region of M13, from an HgiAI site at
5465 to the AhaIII site at 5941, inserted at the NdeI site of pUC. (Available from Vieira J. and Messing, J.
Waksman Institute, Rutgers University, Rutgers, N.J.)

Materials

Terminal deoxynucleotide transferase (TDT), RNaseH, E. coli DNA polymerase, T4 kinase, and restriction
enzymes were obtained from Bethesda Research Laboratories; E. coli DNA ligase was obtained from New
England Biolabs; reverse transcriptase was obtained from Life Sciences, Inc.; isotopes were obtained from
Amersham; X-gal was obtained from Bachem, Inc. Torrance, CA.

Example I 

Construction of a Napin Promoter

There are 298 nucleotides upstream of the ATG start codon of the napin gene on the pgN1 clone, a 3.3 kb
EcoRI fragment of B. napus genomic DNA containing a napin gene cloned into pUC8 (available from Marti
Crouch, University of Indiana). pgN1 DNA was digested with EcoRI and SstI and ligated to EcoRI/SstI digested
pCGN706. (pCGN706 is an XhoI/PstI fragment containing 3' and polyadenylation sequences of another napin
cDNA clone pN2 (Crouch et al., 1983 supra) cloned in pCGN566 at the SalI and PstI sites.) The resulting clone
pCGN707 was digested with SalI and treated with the enzyme Bal31 to remove some of the coding region of
the napin gene. The resulting resected DNA was digested with SmaI after the Bal31 treatment and religated.
One of the clones, pCGN713, selected by size, was subcloned by EcoRI and BamHI digestion into both
EcoRI/BamHI digested pEMBL18 (Dente et al., Nucleic Acids Res. (1983) 11:1645-1655) and pUC118 to give
E418 and E4118 respectively. The extent of Bal31 digestion was confirmed by Sanger dideoxy sequencing of
E418 template. The Bal31 deletion of the promoter region extended only to 57 nucleotides downstream of the
start codon, thus containing the 5' end of the napin coding sequence and about 300 bp of the 5' non-coding
region. E4118 was tailored to delete all of the coding region of napin including the ATG start codon by in vitro
mutagenesis by the method of Zoller and Smith (Nucleic Acids Res. (1982) 10:6487-6500) using an
oligonucleotide primer 5'-GATGTnTGTATGTGG6CCCCTAGGAGATC-3'. Screening for the appropriate
mutant was done by two transformations into E. coli strain JM83 (Messing J., In: Recombinant DNA Technical
Bulletin, NIH Publication No. 79-99, 2 No. 2, 1979, pp 43-48) and SmaI digestion of putative transformants. The
resulting napin promoter clone is pCGN778 and contains 298 nucleotides from the EcoRI site of pgN1 to the A
nucleotide just before the ATG start codon of napin. The promoter region was subcloned into a
chloramphenicol resistant background by digestion with EcoRI and BamHI and ligation to EcoRI/BamHI
digested pCGN565 to give pCGN779c.

Extension of the Napin Promoter Clone 

pCGN779c contains only 298 nucleotides of potential 5'-regulatory sequence. The napin promoter was
extended with a 1.8 kb fragment found upstream of the 5'-EcoRI site on the original λBnNa clone. The ~3.5 kb
XhoI fragment of λBnNa (available from M. Crouch), which includes the napin region, was subcloned into
SalI-digested pUC119 to give pCGN930. A HindIII site close to a 5' XhoI site was used to subclone the
HindIII/EcoRI fragment of pCGN930 into HindIII/EcoRI digested Bluescript + (Vector Cloning Systems, San
Diego, CA) to give pCGN942. An extended napin promoter was made by ligating pCGN779c digested with
EcoRI and PstI and pCGN942 digested with EcoRI and PstI to make pCGN943. This promoter contains ~2.1
kb of sequence upstream of the original ATG of the napin gene contained on λBnNa. A partial sequence of the
promoter region is shown in Figure 1.

Napin Cassettes 

The extended napin promoter and a napin 3'-regulatory region are combined to make a napin cassette for
expressing genes seed-specifically. The napin 3'-region used is from the plasmid pCGN1924 containing the
XhoI/EcoRI fragment from pgN1 (XhoI site is located 18 nucleotides from the stop codon of the napin gene)
subcloned into EcoRI/SalI digested pCGN565. HindIII/PstI digested pCGN943 and pCGN1924 are ligated to
make the napin cassette pCGN944, with unique cloning sites SmaI, SalI, and PstI for inserting genes.

Construction of cDNA Library from Spinach Leaves

Total RNA was extracted from young spinach leaves in 4M guanidine thiocyanate buffer as described by
Facciotti et al. (Biotechnology (1985) 3:241-246). Total RNA was subjected to oligo(dT)-cellulose column
chromatography two times to yield poly(A)+ RNA as described by Maniatis et al., (1982) Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory, New York. A cDNA library was constructed in pUC13-Cm
according to the method of Gubler and Hoffman (Gene (1983) 25:263-269) with slight modifications. RNasin
was omitted in the synthesis of first strand cDNA as it interfered with second strand synthesis if not completely
removed, and dCTP was used to tail the vector DNA and dGTP to tail double-stranded cDNA instead of the
reverse as described in the paper. The annealed cDNA was transformed to competent E. coli JM83 (Messing
(1979) supra) cells according to Hanahan (J. Mol. Biol. (1983) 166:557-580) and spread onto LB agar plates
(Miller (1972) Experiments in Molecular Genetics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New
York) containing 50 µg/ml chloramphenicol and 0.005% X-Gal.

Identification of Spinach ACP-I cDNA 

A total of approximately 8000 cDNA clones were screened by performing Southern blots (Southern, J. Mol.
Biol. (1975) 98:503) and dot blot (described below) hybridizations with clone analysis DNA from 40 pools
representing 200 cDNA clones each (see below). A 5' end labeled synthetic oligonucleotide (ACPP4) that is at
least 66% homologous with a 16 amino acid region of spinach ACP-I
(5'-GATGTCTTGAGCCTTGTCCTCATCCACATTGATACCAAACTCCTCCTC-3') is the complement to a DNA sequence
that could encode the 16 amino
acid peptide glu-glu-glu-phe-gly-ile-asn-val-asp-glu-asp-lys-ala-gln-asp-ile, residues 49-64 of spinach ACP-I
(Kuo and Ohlrogge, Arch. Biochem. Biophys. (1984) 234:290-296) and was used for an ACP probe.
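
The stated relationship between the ACPP4 probe and the 16 residue peptide can be checked with a short Python sketch: reverse-complement the probe, then translate the result with the standard genetic code. The probe sequence is taken from the text; the function and variable names are illustrative assumptions.

```python
# Standard genetic code, built from the conventional TCAG-ordered string.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter amino acid names, lower case as in the text.
THREE_LETTER = {
    "A": "ala", "R": "arg", "N": "asn", "D": "asp", "C": "cys",
    "Q": "gln", "E": "glu", "G": "gly", "H": "his", "I": "ile",
    "L": "leu", "K": "lys", "M": "met", "F": "phe", "P": "pro",
    "S": "ser", "T": "thr", "W": "trp", "Y": "tyr", "V": "val",
    "*": "stop",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate an in-frame DNA sequence into a hyphenated peptide string."""
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    return "-".join(THREE_LETTER[CODON_TABLE[c]] for c in codons)


# Probe sequence as given in the text (5' to 3').
ACPP4 = "GATGTCTTGAGCCTTGTCCTCATCCACATTGATACCAAACTCCTCCTC"

if __name__ == "__main__":
    coding_strand = reverse_complement(ACPP4)
    print(coding_strand)
    print(translate(coding_strand))
```

The probe is 48 nucleotides long, three per residue, and its complement translates to the peptide quoted above. Because the genetic code is degenerate, this coding strand is only one of many that could encode the peptide, which is consistent with the text describing the probe as at least 66% homologous to the actual sequence.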

Clone analysis DNA for Southern and dot blot hybridizations was prepared as follows. Transformants were
transferred from agar plates to LB containing 50 µg/ml chloramphenicol in groups of ten clones per 10 ml
media. Cultures were incubated overnight in a 37°C shaking incubator and then diluted with an equal volume of
media and allowed to grow for 5 more hours. Pools of 200 cDNA clones each were obtained by mixing
contents of 20 samples. DNA was extracted from these cells as described by Birnboim and Doly (Nucleic
Acids Res. (1979) 7:1513-1523). DNA was purified to enable digestion with restriction enzymes by extractions
with phenol and chloroform followed by ethanol precipitation. DNA was resuspended in sterile, distilled water
and 1 µg of each of the 40 pooled DNA samples was digested with EcoRI and HindIII and electrophoresed
through 0.7% agarose gels. DNA was transferred to nitrocellulose filters following the blot hybridization
technique of Southern.
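
The pooling scheme can be summarized numerically. This Python sketch uses only the figures stated in the text (ten clones per culture, 20 cultures per pool, 40 pools); the zero-based clone numbering and the `locate` helper are hypothetical, added to show how a clone maps back to its pool and culture.

```python
CLONES_PER_CULTURE = 10   # ten clones grown together per 10 ml culture
CULTURES_PER_POOL = 20    # contents of 20 cultures mixed per pool
N_POOLS = 40              # pooled DNA samples screened

CLONES_PER_POOL = CLONES_PER_CULTURE * CULTURES_PER_POOL   # 200
TOTAL_CLONES = CLONES_PER_POOL * N_POOLS                   # 8000


def locate(clone_index):
    """Map a 0-based clone index to (pool, culture within pool).

    The sequential numbering is an assumption for illustration; the text
    does not say how clones were assigned to cultures.
    """
    if not 0 <= clone_index < TOTAL_CLONES:
        raise ValueError("clone index out of range")
    pool, within_pool = divmod(clone_index, CLONES_PER_POOL)
    culture = within_pool // CLONES_PER_CULTURE
    return pool, culture


if __name__ == "__main__":
    print(CLONES_PER_POOL, TOTAL_CLONES, locate(205))
```

Screening 40 pools first and then only the individual clones of a positive pool reduces the number of hybridizations from roughly 8000 to 40 plus 200.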

ACPP4 was 5' end-labeled using γ-32P ATP and T4 kinase according to the manufacturer's specifications.
Nitrocellulose filters from Southern blot transfer of clone analysis DNA were hybridized (24 hours, 42°C) and
washed according to Berent et al. (BioTechniques (1985) 3:208-220). Dot blots of the same set of DNA pools
were prepared by applying 1 µg of each DNA pool to nylon membrane filters in 0.5 M NaOH. These blots were
hybridized with the probe for 24 hours at 42°C in 50% formamide/1% SDS/1 M NaCl, and washed at room
temperature in 2X SSC/0.1% SDS (1X SSC = 0.15 M NaCl; 0.015 M Na citrate; SDS = sodium dodecylsulfate).
DNA from the pool which was hybridized by the ACPP4 oligoprobe was transformed to JM83 cells and plated
as above to yield individual transformants. Dot blots of these individual cDNA clones were prepared by
applying DNA to nitrocellulose filters which were hybridized with the ACPP4 oligonucleotide probe and
analyzed using the same conditions as for the Southern blots of pooled DNA samples.
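
As a small worked example of the buffer notation, the following Python sketch scales the 1X SSC definition given in the text to other strengths, such as the 2X wash. The function name and dictionary layout are illustrative assumptions.

```python
# 1X SSC as defined in the text, in mol/l.
SSC_1X = {"NaCl": 0.15, "Na citrate": 0.015}


def ssc(fold):
    """Return component molarities for an SSC solution of the given strength."""
    return {name: round(molar * fold, 6) for name, molar in SSC_1X.items()}


if __name__ == "__main__":
    # The room-temperature wash uses 2X SSC.
    print(ssc(2))
```

Under this definition the 2X wash buffer contains 0.30 M NaCl and 0.030 M sodium citrate.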

Nucleotide Sequence Analysis 

The positive clone, pCGN1SOL, was analyzed by digestion with restriction enzymes and the following partial
map was obtained.

The cDNA clone was subcloned into pUC118 and pUC119 using standard laboratory techniques of
restriction, ligation, transformation, and analysis (Maniatis et al., (1982) supra). Single-stranded DNA template
was prepared and DNA sequence was determined using the Sanger dideoxy technique (Sanger et al., (1977)
Proc. Nat. Acad. Sci. USA 74:5463-5467). Sequence analysis was performed using a software package from
IntelliGenetics, Inc.

pCGN1SOL contains an (approximately) 700 bp cDNA insert including a stretch of A residues at the 3'
terminus which represents the poly(A) tail of the mRNA. An ATG codon at position 61 is presumed to encode
the MET translation initiation codon. This codon is the start of a 411 nucleotide open reading frame, of which
nucleotides 229-471 encode a protein whose amino acid sequence corresponds almost perfectly with the
published amino acid sequence of ACP-I of Kuo and Ohlrogge supra as described previously. In addition to mature
protein, the pCGN1SOL also encodes a 56 residue transit peptide sequence, as might be expected for a
nuclear-encoded chloroplast protein.
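
The coordinates quoted for the cDNA insert are mutually consistent, which a few lines of Python arithmetic can confirm. The numbers come from the text; the variable and function names are illustrative assumptions, and positions are treated as 1-based and inclusive.

```python
ATG_POSITION = 61      # first nucleotide of the presumed initiation codon
ORF_LENGTH = 411       # nucleotides in the open reading frame
MATURE_START = 229     # first nucleotide encoding the mature protein
MATURE_END = 471       # last nucleotide of the region matching ACP-I


def orf_summary():
    """Derive ORF end and segment sizes from the stated coordinates."""
    orf_end = ATG_POSITION + ORF_LENGTH - 1
    transit_nt = MATURE_START - ATG_POSITION
    mature_nt = MATURE_END - MATURE_START + 1
    return {
        "orf_end": orf_end,
        "transit_residues": transit_nt // 3,
        "mature_codons": mature_nt // 3,
        "in_frame": transit_nt % 3 == 0 and mature_nt % 3 == 0,
    }


if __name__ == "__main__":
    print(orf_summary())
```

The open reading frame ends at nucleotide 471, the region from 61 to 228 accounts for the 56 residue transit peptide, and nucleotides 229-471 span 81 codons. The text does not say whether that last figure includes the stop codon.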
Napin - ACP Construct

pCGN796 was constructed by ligating pCGN1SOL digested with HindIII/BamHI, pUC8 digested with HindIII
and BamHI and pUC118 digested with BamHI. The ACP gene from pCGN796 was transferred into a
chloramphenicol background by digestion with BamHI and ligation with BamHI digested pCGN565. The
resulting pCGN1902 was digested with EcoRI and SmaI and ligated to EcoRI/SmaI digested pUC118 to give
pCGN1920. The ACP gene in pCGN1920 was digested at the NcoI site, filled in by treatment with the Klenow
fragment, digested with SmaI and religated to form pCGN1919. This eliminated the 5'-coding sequences from
the ACP gene and regenerated the ATG. This ACP gene was flanked with PstI sites by digesting pCGN1919
with EcoRI, filling in the site with the Klenow fragment and ligating a PstI linker; this clone is called pCGN945.

The ACP gene of pCGN945 was moved as a BamHI/PstI fragment to pUC118 digested with BamHI and PstI
to create pCGN945a so that a SmaI site (provided by the pUC118) would be at the 5'-end of the ACP
sequences to facilitate cloning into the napin cassette pCGN944. pCGN945a digested with SmaI and PstI was
ligated to pCGN944 digested with SmaI and PstI to produce the napin ACP cassette pCGN946. The napin ACP
cassette was then transferred into the binary vector pCGN783 by cloning from the HindIII site to produce
pCGN948.

Construction of the Binary Vector pCGN783 

pCGN783 is a binary plasmid containing the left and right T-DNA borders of A. tumefaciens (Barker et al.,
Plant Mol. Biol. (1983) 2:335-350); the gentamicin resistance gene of pPH1JI (Hirsch et al., Plasmid (1984)
12:139-141); the 35S promoter of cauliflower mosaic virus (CaMV) (Gardner et al., Nucleic Acids Res. (1981)
9:2871-2890); the kanamycin resistance gene of Tn5 (Jorgenson et al., infra and Wolff et al., ibid. (1985)
13:355-367) and the 3' region from transcript 7 of pTiA6 (Barker et al., supra (1983)).

To obtain the gentamicin resistance marker, the gentamicin resistance gene was isolated from a 3.1 kb
EcoRI-PstI fragment of pPH1JI and cloned into pUC9 yielding pCGN549. The HindIII-BamHI fragment
containing the gentamicin resistance gene was substituted for the HindIII-BglII fragment of pCGN587 creating
pCGN594.

pCGN587 was prepared as follows: The HindIII-SmaI fragment of Tn5 containing the entire structural gene
for APHII (Jorgenson et al., Mol. Gen. Genet. (1979) 177:65) was cloned into pUC8 (Vieira and Messing, Gene
(1982) 19:259), converting the fragment into a HindIII-EcoRI fragment, since there is an EcoRI site immediately
adjacent to the SmaI site. The PstI-EcoRI fragment containing the 3'-portion of the APHII gene was then
combined with an EcoRI-BamHI-SalI-PstI linker into the EcoRI site of pUC7 (pCGN546W). Since this construct
does not confer kanamycin resistance, kanamycin resistance was obtained by inserting the BglII-PstI fragment
of the APHII gene into the BamHI-PstI site (pCGN546X). This procedure reassembles the APHII gene, so that
EcoRI sites flank the gene. An ATG codon was upstream from and out of reading frame with the ATG initiation
codon of APHII. The undesired ATG was avoided by inserting a Sau3A-PstI fragment from the 5'-end of APHII,
which fragment lacks the superfluous ATG, into the BamHI-PstI site of pCGN546W to provide plasmid
pCGN560.

The EcoRI fragment containing the APHII gene was then cloned into the unique EcoRI site of pCGN451,
which contains an octopine synthase cassette for expression, to provide pCGN552 (1ATG).

pCGN451 includes an octopine cassette which contains about 1556 bp of the 5' non-coding region fused via
an EcoRI linker to the 3' non-coding region of the octopine synthase gene of pTiA6. The pTi coordinates are
11,207 to 12,823 for the 3' region and 13,643 to 15,208 for the 5' region as defined by Barker et al., Plant Mol.
Biol. (1983) 2:325.

The 5' fragment was obtained as follows. A small subcloned fragment containing the 5' end of the coding
region, as a BamHI-EcoRI fragment, was cloned in pBR322 as plasmid pCGN407. The BamHI-EcoRI fragment
has an XmnI site in the coding region, while pBR322 has two XmnI sites. pCGN407 was digested with XmnI,
resected with Bal31 nuclease and EcoRI linkers added to the fragments. After EcoRI and BamHI digestion, the
fragments were size fractionated, the fractions cloned and sequenced. In one case, the entire coding region
and 10 bp of the 5' non-translated sequences had been removed leaving the 5' non-transcribed region, the
mRNA cap site and 16 bp of the 5' non-translated region (to a BamHI site) intact. This small fragment was
obtained by size fractionation on a 7% acrylamide gel and fragments approximately 130 bp long eluted.

This size fractionated DNA was ligated into M13mp9 and several clones sequenced and the sequence
compared to the known sequence of the octopine synthase gene. The M13 construct was designated p14,
which plasmid was digested with BamHI and EcoRI to provide the small fragment which was ligated to a XhoI
to BamHI fragment containing upstream 5' sequences from pTiA6 (Garfinkel and Nester, J. Bacteriol. (1980)
144:732) and to an EcoRI to XhoI fragment containing the 3' sequences.

The resulting XhoI fragment was cloned into the XhoI site of a pUC8 derivative, designated pCGN426. This
plasmid differs from pUC8 by having the sole EcoRI site filled in with DNA polymerase I, and having lost the PstI
and HindIII site by nuclease contamination of HincII restriction endonuclease, when a XhoI linker was inserted
into the unique HincII site of pUC8. The resulting plasmid pCGN451 has a single EcoRI site for the insertion of
protein coding sequences between the 5' non-coding region (which contains 1,550 bp of 5' non-transcribed
sequence including the right border of the T-DNA, the mRNA cap site and 16 bp of 5' non-translated
sequence) and the 3' region (which contains 267 bp of the coding region, the stop codon, 196 bp of 3'
non-translated DNA, the polyA site and 1,153 bp of 3' non-transcribed sequence). pCGN451 also provides the
right T-DNA border.
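
As a quick arithmetic check on the figures quoted for the octopine synthase cassette, the following Python sketch computes the span of each pTi coordinate range and the total of the 3' components listed above. The name `span` is illustrative only, and treating a span as the plain difference of the two coordinates is an assumption about how Barker's numbering was applied.

```python
# Coordinate ranges quoted in the text (Barker's numbering of pTi DNA).
REGION_3P = (11207, 12823)
REGION_5P = (13643, 15208)

# Components of the 3' region as listed in the text, in bp. The text gives
# no length for the stop codon or the polyA site, so they are left out.
COMPONENTS_3P = {
    "coding region": 267,
    "3' non-translated": 196,
    "3' non-transcribed": 1153,
}

def span(region):
    """Length of a region taken as end coordinate minus start coordinate."""
    start, end = region
    return end - start

span_3p = span(REGION_3P)
span_5p = span(REGION_5P)
sum_3p = sum(COMPONENTS_3P.values())

print("3' span:", span_3p, "bp; listed components total", sum_3p, "bp")
print("5' span:", span_5p, "bp")
```

Under that assumption the 3' span equals the total of the three listed components, and the 5' span is within a few base pairs of the "about 1556 bp" quoted for the 5' non-coding region.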

The resulting plasmid pCGN451 having the ocs 5' and the ocs 3' in the proper orientation was digested with
EcoRI and the EcoRI fragment from pCGN551 containing the intact kanamycin resistance gene inserted into
the EcoRI site to provide pCGN552 having the kanamycin resistance gene in the proper orientation.
This ocs/KAN gene was used to provide a selectable marker for the trans type binary vector pCGN587.

The 5' portion of the engineered octopine synthase promoter cassette consists of pTiA6 DNA from the XhoI
at bp 15208-13844 (Barker's numbering), which also contains the T-DNA boundary sequence (border)
implicated in T-DNA transfer. In the plasmid pCGN587, the ocs/KAN gene from pCGN552 provides a
selectable marker as well as the right border. The left boundary region was first cloned in M13mp9 as a
HindIII-SmaI piece (pCGN502) (base pairs 602-2213) and recloned as a KpnI-EcoRI fragment in pCGN565 to
provide pCGN580. pCGN565 is a cloning vector based on pUC8-Cm, but containing pUC18 linkers. pCGN580
was linearized with BamHI and used to replace the smaller BglII fragment of pVCK102 (Knauf and Nester,
Plasmid (1982) 8:45), creating pCGN585. By replacing the smaller SalI fragment of pCGN585 with the XhoI
fragment from pCGN552 containing the ocs/KAN gene, pCGN587 was obtained.
The pCGN594 HindIII-BamHI region, which contains a 5'-ocs-kanamycin-ocs-3' (ocs is octopine synthase
with 5' designating the promoter region and 3' the terminator region, see U.S. application serial no. 775,923,
filed September 13, 1985) fragment was replaced with the HindIII-BamHI polylinker region from pUC18.

pCGN566 contains the EcoRI-HindIII linker of pUC18 inserted into the EcoRI-HindIII sites of pUC13-Cm. The
HindIII-BglII fragment of pNW31C-8,29-1 (Thomashow et al., Cell (1980) 19:729) containing ORF1 and -2 of
pTiA6 was subcloned into the HindIII-BamHI sites of pCGN566 producing pCGN703.

The Sau3A fragment of pCGN703 containing the 3' region of transcript 7 (corresponding to bases 2396-2920
of pTiA6 (Barker et al., (1983) supra)) was subcloned into the BamHI site of pUC18 producing pCGN709. The
EcoRI-SmaI polylinker region of pCGN709 was substituted with the EcoRI-SmaI fragment of pCGN587, which
contains the kanamycin resistance gene (APH3'-II) producing pCGN726.
The EcoRI-SalI fragment of pCGN726 plus the BglII-EcoRI fragment of pCGN734 were inserted into the
BamHI-SalI site of pUC8-Cm producing pCGN738. pCGN726c is derived from pCGN738 by deleting the 900 bp
EcoRI-EcoRI fragment.

To construct pCGN167, the AluI fragment of CaMV (bp 7144-7735) (Gardner et al., Nucl. Acids Res. (1981)
9:2871-2888) was obtained by digestion with AluI and cloned into the HincII site of M13mp7 (Messing et al.,
Nucl. Acids Res. (1981) 9:309-321) to create C614. An EcoRI digest of C614 produced the EcoRI fragment
from C614 containing the 35S promoter which was cloned into the EcoRI site of pUC8 (Vieira and Messing,
Gene (1982) 19:259) to produce pCGN146.

To trim the promoter region, the BglII site (bp 7670) was treated with BglII and resected with Bal31 and
subsequently a BglII linker was attached to the Bal31 treated DNA to produce pCGN147.
pCGN148a containing a promoter region, selectable marker (KAN with 2 ATG's) and 3' region, was prepared
by digesting pCGN528 with BglII and inserting the BamHI-BglII promoter fragment from pCGN147. This
fragment was cloned into the BglII site of pCGN528 so that the BglII site was proximal to the kanamycin gene of
pCGN528.

The shuttle vector used for this construct, pCGN528, was made as follows. pCGN525 was made by
digesting a plasmid containing Tn5 which harbors a kanamycin gene (Jorgenson et al., Mol. Gen. Genet. (1979)
177:65) with HindIII-BamHI and inserting the HindIII-BamHI fragment containing the kanamycin gene into the
HindIII-BamHI sites in the tetracycline gene of pACYC184 (Chang and Cohen, J. Bacteriol. (1978)
134:1141-1156). pCGN526 was made by inserting the BamHI fragment 19 of pTiA6 (Thomashow et al., Cell
(1980) 19:729-739), modified with XhoI linkers inserted into the SmaI site, into the BamHI site of pCGN525.
pCGN528 was obtained by deleting the small XhoI fragment from pCGN526 by digesting with XhoI and
religating.

pCGN149a was made by cloning the BamHI-kanamycin gene fragment from pMB9KanXXI into the BamHI
site of pCGN148a.

pMB9KanXXI is a pUC4K variant (Vieira and Messing, Gene (1982) 19:259-268) which has the XhoI site
missing but contains a functional kanamycin gene from Tn903 to allow for efficient selection in Agrobacterium.
pCGN149a was digested with BglII and SphI. This small BglII-SphI fragment of pCGN149a was replaced with
the BamHI-SphI fragment from M1 (see below) isolated by digestion with BamHI and SphI. This produces
pCGN167, a construct containing a full length CaMV promoter, 1ATG-kanamycin gene, 3' end and the bacterial
Tn903-type kanamycin gene. M1 is an EcoRI fragment from pCGN546X (see construction of pCGN587) and
was cloned into the EcoRI cloning site of M13mp9 in such a way that the PstI site in the 1ATG-kanamycin gene
was proximal to the polylinker region of M13mp9.
The HindIII-BamHI fragment in the pCGN167 containing the CaMV-35S promoter, 1ATG-kanamycin gene
and the BamHI fragment 19 of pTiA6 was cloned into the BamHI-HindIII sites of pUC19 creating pCGN976. The
35S promoter and 3' region from transcript 7 was developed by inserting a 0.7 kb HindIII-EcoRI fragment of
pCGN976 (35S promoter) and the 0.5 kb EcoRI-SalI fragment of pCGN709 (transcript 7:3') into the HindIII-SalI
sites of pCGN566 creating pCGN766c.

The 0.7 kb HindIII-EcoRI fragment of pCGN766c (CaMV-35S promoter) was ligated to the 1.6 kb EcoRI-SalI
fragment in pCGN726c (1ATG-KAN 3' region) followed by insertion into the HindIII-SalI sites of pUC119 to
produce pCGN778. The 2.2 kb region of pCGN778, HindIII-SalI fragment containing the CaMV-35S promoter
and 1ATG-KAN-3' region was used to replace the HindIII-SalI linker region of pCGN739 to produce pCGN783.

Transfer of the Binary Vector pCGN948 into Agrobacterium

pCGN948 was introduced into Agrobacterium tumefaciens EHA101 (Hood et al., J. Bacteriol. (1986)
168:1291-1301) by transformation. An overnight 2 ml culture of EHA101 was grown in MG/L broth at 36°C. 0.5
ml was inoculated into 100 ml of MG/L broth (Garfinkel and Nester, J. Bacteriol. (1980) 144:732-743) and grown
in a shaking incubator for 5 h at 30°C. The cells were pelleted by centrifugation at 7K, resuspended in 1 ml of
MG/L broth and placed on ice. Approximately 1 µg of pCGN948 DNA was placed in 100 µl of MG/L broth to
which 200 µl of the EHA101 suspension was added; the tube containing the DNA-cell mix was immediately
placed into a dry ice/ethanol bath for 5 minutes. The tube was quick thawed by 5 minutes in a 37°C water bath
followed by 2 h of shaking at 30°C after adding 1 ml of fresh MG/L medium. The cells were pelleted and spread
onto MG/L plates (1.5% agar) containing 100 mg/l gentamicin. Plasmid DNA was isolated from individual
gentamicin-resistant colonies, transformed back into E. coli, and characterized by restriction enzyme analysis
to verify that the gentamicin-resistant EHA101 contained intact copies of pCGN948. Single colonies are picked
and purified by two more streakings on MG/L plates containing 100 mg/l gentamicin.

Transformation and Regeneration of B. Napus 

Seeds of Brassica napus cv Westar were soaked in 95% ethanol for 4 minutes. They were sterilized in 1%
solution of sodium hypochlorite with 50 µl of "Tween 20" surfactant per 100 ml sterilant solution. After soaking
for 45 minutes, seeds were rinsed 4 times with sterile distilled water. They were planted in sterile plastic
boxes 7 cm wide, 7 cm long, and 10 cm high (Magenta) containing 50 ml of 1/10th concentration of MS
(Murashige minimal organics medium, Gibco) with added pyridoxine (500 µg/l), nicotinic acid (50 µg/l), glycine
(200 µg/l) and solidified with 0.6% agar. The seeds germinated and were grown at 22°C in a 16h-8h light-dark
cycle with light intensity approximately 65 µEm-2s-1. After 5 days the seedlings were taken under sterile
conditions and the hypocotyls excised and cut into pieces of about 4 mm in length. The hypocotyl segments
were placed on a feeder plate or without the feeder layer on top of a filter paper on the solidified B5 0/1/1 or
B5 0/1/0 medium. B5 0/1/0 medium contains B5 salts and vitamins (Gamborg, Miller and Ojima, Experimental
Cell Res. (1968) 50:151-158), 3% sucrose, 2,4-dichlorophenoxyacetic acid (1.0 mg/l), pH adjusted to 6.6, and
the medium is solidified with 0.6% Phytagar; B5 0/1/1 is the same with the addition of 1.0 mg/l kinetin. Feeder
plates were prepared 24 hours in advance by pipetting 1.0 ml of a stationary phase tobacco suspension culture
(maintained as described in Fillatti et al., Molecular General Genetics (1987) 206:192-199) onto B5 0/1/0 or
B5 0/1/1 medium. Hypocotyl segments were cut and placed on feeder plates 24 hours prior to Agrobacterium
treatment.

Agrobacterium tumefaciens (strain EHA101 x 948) was prepared by incubating a single colony of
Agrobacterium in MG/L broth at 30°C. Bacteria were harvested 16 hours later and dilutions of 10^ bacteria per
ml were prepared in MG/L broth. Hypocotyl segments were inoculated with bacteria by placing the segments
in an Agrobacterium suspension and allowing them to sit for 30-60 minutes, then removing and transferring to
Petri plates containing B5 0/1/1 or 0/1/0 medium (0/1/1 intends 1 mg/l 2,4-D and 1 mg/l kinetin and 0/1/0
intends no kinetin). The plates were incubated in low light at 22°C. The co-incubation of bacteria with the
hypocotyl segments took place for 24-48 hours. The hypocotyl segments were removed and placed on
B5 0/1/1 or 0/1/0 containing 500 mg/l carbenicillin (kanamycin sulfate at 10, 25, or 50 mg/l was sometimes
added at this time) for 7 days in continuous light (approximately 65 µEm-2s-1) at 22°C. The segments were
transferred to B5 salts medium containing 1% sucrose, 3 mg/l benzylamino purine (BAP) and 1 mg/l zeatin.
This was supplemented with 500 mg/l carbenicillin, 10, 25, or 50 mg/l kanamycin sulfate, and solidified with
0.6% Phytagar (Gibco). Thereafter, explants were transferred to fresh medium every two weeks.

After one month green shoots developed from green calli which were selected on media containing
kanamycin. Shoots continued to develop for three months. The shoots were cut from the calli when they were
at least 1 cm high and placed on B5 medium with 1% sucrose, no added growth substances, 300 mg/l
carbenicillin, and solidified with 0.6% Phytagar. The shoots continued to grow and several leaves were
removed to test for neomycin phosphotransferase II (NPTII) activity. Shoots which were positive for NPTII
activity were placed in Magenta boxes containing B5 0/1/1 medium with 1% sucrose, 2 mg/l indolebutyric
acid, 200 mg/l carbenicillin, and solidified with 0.6% Phytagar. After a few weeks the shoots developed roots
and were transferred to soil. The plants were grown in a growth chamber at 22°C in a 16-8 hours light-dark
cycle with light intensity 220 µEm-2s-1 and after several weeks were transferred to the greenhouse.
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Southern Data 

Regenerated B. napus plants from cocultivations of Agrobacterium tumefaciens EHA101 containing
pCGN948 and B. napus hypocotyls were examined for proper integration and embryo-specific expression of
the spinach leaf ACP gene. Southern analysis was performed using DNA isolated from leaves of regenerated
plants by the method of Dellaporta et al. (Plant Mol. Biol. Rep. (1983) 1:19-21) and purified once by banding in
CsCl. DNA (10 µg) was digested with the restriction enzyme EcoRI, electrophoresed on a 0.7% agarose gel
and blotted to nitrocellulose (see Maniatis et al., (1982) supra). Blots were probed with pCGN946 DNA
containing 1.8 kb of the spinach ACP sequence or with the EcoRI/HindIII fragment isolated from pCGN936c
(made by transferring the HindIII/EcoRI fragment of pCGN930 into pCGN566) containing the napin 5'
sequences labeled with 32P-dCTP by nick translation (described by the manufacturer, BRL Nick Translation
Reagent Kit, Bethesda Research Laboratories, Gaithersburg, MD). Blots were prehybridized and hybridized in
50% formamide, 10x Denhardt's, 5xSSC, 0.1% SDS, 5 mM EDTA, 100 µg/ml calf thymus DNA and 10% dextran
sulfate (hybridization only) at 42°C. (Reagents described in Maniatis et al., (1982) supra.) Washes were in
1xSSC, 0.1% SDS, 30 min and twice in 0.1xSSC, 0.1% SDS at 55°C.
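
The prehybridization mix above can be turned into pipetting volumes with the usual C1·V1 = C2·V2 dilution rule. The Python sketch below does this for a 10 ml batch. The stock concentrations, the batch volume and the function name `stock_volumes` are assumptions for illustration and are not given in the text. Dextran sulfate is left out because the text adds it for hybridization only.

```python
# Final concentrations of the prehybridization solution as given in the text.
FINAL = {
    "formamide (%)": 50.0,
    "Denhardt's (x)": 10.0,
    "SSC (x)": 5.0,
    "SDS (%)": 0.1,
    "EDTA (mM)": 5.0,
    "calf thymus DNA (ug/ml)": 100.0,
}

# ASSUMED stock concentrations (not stated in the text), in the same units.
STOCK = {
    "formamide (%)": 100.0,
    "Denhardt's (x)": 50.0,
    "SSC (x)": 20.0,
    "SDS (%)": 10.0,
    "EDTA (mM)": 500.0,
    "calf thymus DNA (ug/ml)": 10000.0,
}

def stock_volumes(final, stock, total_ml):
    """Return ml of each stock (C1*V1 = C2*V2) plus water to reach total_ml."""
    volumes = {name: total_ml * conc / stock[name] for name, conc in final.items()}
    water = total_ml - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute to reach the requested volume")
    volumes["water"] = water
    return volumes

recipe = stock_volumes(FINAL, STOCK, total_ml=10.0)
for name, ml in recipe.items():
    print(f"{name:28s}{ml:6.2f} ml")
```

With these assumed stocks only about 0.2 ml of the batch is left for water, so the dextran sulfate for the hybridization step would have to come from more concentrated stocks or be added as a solid. The function raises an error rather than returning a negative water volume.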

Autoradiograms showed two bands of approximately 3.3 and 3.2 kb hybridized in the EcoRI digests of DNA
from four plants when probed with the ACP gene (pCGN945), indicating proper integration of the spinach leaf
ACP construct in the plant genome since 3.3 and 3.2 kb EcoRI fragments are present in the T-DNA region of
pCGN948. The gene construct was present in single or multiple loci in the different plants as judged by the
number of plant DNA-construct DNA border fragments detected when probed with the napin 5' sequences.
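
The reasoning behind the Southern result (internal EcoRI fragments of fixed size, border fragments of variable size) can be illustrated with a small in-silico digest. The Python sketch below uses a hypothetical toy sequence, not the pCGN948 T-DNA, and the function name `digest_lengths` is illustrative only.

```python
ECORI = "GAATTC"
CUT_OFFSET = 1  # EcoRI cuts after the first G of its site: G^AATTC

def digest_lengths(seq, motif=ECORI, offset=CUT_OFFSET):
    """Fragment lengths from a complete digest of a linear sequence."""
    cuts = []
    start = seq.find(motif)
    while start != -1:
        cuts.append(start + offset)
        start = seq.find(motif, start + 1)
    edges = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(edges, edges[1:])]

# HYPOTHETICAL toy sequence: two EcoRI sites separated by 20 bp of filler.
toy = "A" * 10 + ECORI + "C" * 20 + ECORI + "T" * 5
fragments = digest_lengths(toy)
print(fragments)
```

Only the middle fragment, which lies between two sites inside the inserted DNA, has a size fixed by the construct. The outer fragments end wherever the next site happens to fall in the flanking DNA, which is why border fragments can be used to estimate the number of insertion loci.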

Northern Data 

Expression of the integrated spinach leaf ACP gene from the napin promoter was detected by Northern
analysis in seeds but not leaves of one of the transformed plants shown to contain the construct DNA.
Developing seeds were collected from the transformed plant 21 days post-anthesis. Embryos were dissected
from the seeds and frozen in liquid nitrogen. Total RNA was isolated from the seed embryos and from leaves of
the transformed plant by the method of Crouch et al., Virology (1985) 140:281-288) and blotted to
nitrocellulose (Thomas, Proc. Natl. Acad. Sci. USA (1980) 77:5201-5205). Blots were prehybridized, hybridized,
and washed as described above. The probe was an isolated PstI/BamHI fragment from pCGN945 containing
only spinach leaf ACP sequences labeled by nick translation.

An RNA band of ~0.8 kb was detected in embryos but not leaves of the transformed plant, indicating
seed-specific expression of the spinach leaf ACP gene.

Example II 

Construction of B. Campestris Napin Promoter Cassette

A BglII partial genomic library of B. campestris DNA was made in the lambda vector Charon 35 using
established protocols (Maniatis et al., 1982, supra). The titer of the amplified library was ~1.2 x 10^9 phage/ml.
Four hundred thousand recombinant bacteriophage were plated at a density of 10^ per 9 x 9 in. NZY plate
(NZYM as described in Maniatis et al., 1982, supra) in NZY + 10 mM MgSO4 + 0.9% agarose after adsorption
to DH1 E. coli cells (Hanahan, J. Mol. Biol. (1983) 166:557) for 20 min at 37°C. Plates were incubated at 37°C for
~13 hours, cooled at 4°C for 2.5 hours and the phage were lifted onto Gene Screen Plus (New England
Nuclear) by laying precut filters over the plates for approximately 1 min and peeling them off. The adsorbed
phage DNA was immobilized by floating the filter on 1.5 M NaCl, 0.5 M NaOH for 1 min, neutralizing in 1.5 M
NaCl, 0.5 M Tris-HCl, pH 8.0 for 2 min and 2xSSC for 3 min. Filters were air dried until just damp, prehybridized
and hybridized at 42°C as described for Southern analysis. Filters were probed for napin-containing clones
using an XhoI/SalI fragment of the cDNA clone BE5 which was isolated from the B. campestris seed cDNA
library described using the probe pN1 (Crouch et al., 1983, supra). Three plaques were hybridized strongly on
duplicate filters and were plaque purified as described (Maniatis et al., 1982, supra).

One of the clones, named lambda CGN1-2, was restriction mapped and the napin gene was localized to
overlapping 2.7 kb XhoI and 2.1 kb SalI restriction fragments. The two fragments were subcloned from lambda
CGN1-2 DNA into pCGN789 (a pUC based vector the same as pUC119 with the normal polylinker replaced by
the synthetic linker 5' GGAATTCGTCGACAGATCTCTGCAGCTCGAGGGATCCAAGCTT 3' (which
represents the polylinker EcoRI, SalI, BglII, PstI, XhoI, BamHI, HindIII)). The identity of the subclones as napin was
confirmed by sequencing. The entire coding region sequence as well as extensive 5' upstream and 3'
downstream sequences were determined (Figure 2). The lambda CGN1-2 napin gene is that encoding the
mRNA corresponding to the BE5 cDNA as determined by the exact match of their nucleotide sequences.
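
The site order stated for the synthetic linker can be checked directly from its sequence. The Python sketch below scans the 43-nucleotide linker for the standard recognition sequences of the seven enzymes named and lists the sites in 5' to 3' order. The helper name `find_sites` is illustrative only.

```python
# Synthetic linker of pCGN789 as given in the text (5' to 3').
LINKER = "GGAATTCGTCGACAGATCTCTGCAGCTCGAGGGATCCAAGCTT"

# Standard recognition sequences of the enzymes named in the text.
SITES = {
    "EcoRI": "GAATTC",
    "SalI": "GTCGAC",
    "BglII": "AGATCT",
    "PstI": "CTGCAG",
    "XhoI": "CTCGAG",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}

def find_sites(seq, sites):
    """Return (1-based start, enzyme) for every occurrence, sorted by position."""
    hits = []
    for name, motif in sites.items():
        start = seq.find(motif)
        while start != -1:
            hits.append((start + 1, name))
            start = seq.find(motif, start + 1)
    return sorted(hits)

hits = find_sites(LINKER, SITES)
order = [name for _, name in hits]
for position, name in hits:
    print(f"{name:8s} starts at nucleotide {position}")
```

Each enzyme appears exactly once, and the sites lie head to tail in the order given in the text, one every six nucleotides from position 2.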

An expression cassette was constructed from the 5'-end and the 3'-end of the lambda CGN1-2 napin gene
as follows in an analogous manner to the construction of pCGN944. The majority of the napin coding region of
pCGN940 was deleted by digestion with SalI and religation to form pCGN1800. Single-stranded DNA from
pCGN1800 was used in an in vitro mutagenesis reaction (Adelman et al., DNA (1983) 2:183-193) using the
synthetic oligonucleotide 5' GCTTGTTCGCCATGGATATCTTCTGTATGTTC 3'. This oligonucleotide inserted an
EcoRV and an NcoI restriction site at the junction of the promoter region and the ATG start codon of the napin
gene. An appropriate mutant was identified by hybridization to the oligonucleotide used for the mutagenesis
and sequence analysis and named pCGN1801.
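
The placement of the two new sites relative to the start codon can be read off the mutagenic oligonucleotide. The Python sketch below takes its reverse complement and locates the EcoRV (GATATC) and NcoI (CCATGG) recognition sequences and the ATG. It assumes the oligonucleotide is written as the strand complementary to the napin coding strand, which the text does not state explicitly. The function name is illustrative only.

```python
# Mutagenic oligonucleotide as given in the text (5' to 3').
OLIGO = "GCTTGTTCGCCATGGATATCTTCTGTATGTTC"

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

# ASSUMPTION: the coding (sense) strand is the reverse complement of the oligo.
sense = reverse_complement(OLIGO)

ecorv = sense.find("GATATC")    # EcoRV recognition sequence
ncoi = sense.find("CCATGG")     # NcoI recognition sequence
atg = sense.find("ATG", ncoi)   # first ATG at or after the NcoI site

print(sense)
print("EcoRV at", ecorv + 1, "NcoI at", ncoi + 1, "ATG at", atg + 1)
```

On this reading the EcoRV site sits immediately upstream of the NcoI site, sharing one nucleotide with it, and the ATG is the one contained in the NcoI recognition sequence.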

A 1.7 kb promoter fragment was subcloned from pCGN1801 by partial digestion with EcoRV and ligation to
pCGN786 (a pCGN566 chloramphenicol based vector with the synthetic linker described above in place of the
normal polylinker) cut with EcoRI and blunted by fill# # resulting expression cassette, pCGN1803 contains
1.725 kb of napin promoter sequence, and 1.265 kb of napin 3' sequences with the unique cloning sites SalI,
BglII, PstI, and XhoI in between. Any sequence that requires seed-specific transcription or expression in
Brassica, i.e., a fatty acid gene, could be inserted in this cassette in a manner analogous to that described for
spinach leaf ACP and the B. napus napin cassette in Example I.

Example III

Other seed-specific promoters may be isolated from genes encoding proteins involved in seed
triacylglycerol synthesis, such as acyl carrier protein from Brassica seeds. Immature seed were collected from
Brassica campestris cv. "R-500," a self-compatible variety of turnip rape. Whole seeds were collected at
stages corresponding approximately to 14 to 28 days after flowering. RNA isolation and preparation of a cDNA
bank was as described above for the isolation of a spinach ACP cDNA clone except that the vector used was
pCGN565. To probe the cDNA bank, the oligonucleotide (5')-ACTTTCTCAACTGTCTCTGGTTTAGCAGC-(3')
was synthesized using an Applied Biosystems DNA Synthesizer, model 380A, according to manufacturer's
recommendations. This synthetic DNA molecule will hybridize at low stringencies to DNA or RNA sequences
coding for the amino acid sequence (ala-ala-lys-pro-glu-thr-val-glu-lys-val). This amino acid sequence has
been reported for ACP isolated from seeds of Brassica napus (Slabas et al., 7th International Symposium of
the Structure and Function of Plant Lipids, University of California, Davis, CA, 1986); ACP from B. campestris
seed is highly homologous. Approximately 2200 different cDNA clones were analyzed using a colony
hybridization technique (Taub and Thompson, Anal. Biochem. (1982) 126:222-230) and hybridization
conditions corresponding to Wood et al. (Proc. Natl. Acad. Sci. (1985) 82:1585-1588). DNA sequence analysis
of two cDNA clones showing obvious hybridization to the oligonucleotide probe indicated that one,
designated pCGN1Bcs, indeed coded for an ACP-precursor protein by the considerable homology of the
encoded amino acid sequence with ACP proteins described from Brassica napus (Slabas et al., 1986 supra).
Similarly to Example II, the ACP cDNA clone can be used to isolate a genomic clone from which an expression
cassette can be fashioned in a manner directly analogous to the B. campestris napin cassette.
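
The relationship between the oligonucleotide probe and the target peptide can be checked by translating the strand complementary to the probe. The Python sketch below does so with the standard genetic code. It takes the probe as 5'-ACTTTCTCAACTGTCTCTGGTTTAGCAGC-3', which is an assumed reading of the printed sequence, and the helper names are illustrative only.

```python
from itertools import product

# ASSUMED reading of the 29-nucleotide probe (5' to 3').
PROBE = "ACTTTCTCAACTGTCTCTGGTTTAGCAGC"

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def translate(seq):
    """Translate complete codons from the first base; a trailing partial codon is ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))

target = reverse_complement(PROBE)   # strand the probe would hybridize to
peptide = translate(target)

print(target)
print(peptide)
```

Under this reading the complementary strand encodes ala-ala-lys-pro-glu-thr-val-glu-lys in one-letter code, followed by the first two nucleotides (GT) of a valine codon, which agrees with the ten-residue sequence quoted in the text.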

Other Examples 

Ninety-six clones from the 14-28 day post-anthesis B. campestris seed cDNA library (described in the
previous example) were screened by dot blot hybridization of miniprep DNA on Gene Screen Plus nylon filters.
Probes used were radioactively labeled first-strand synthesis cDNAs made from the day 14-28 post-anthesis
mRNA or from B. campestris leaf mRNA. Clones which hybridized strongly to seed cDNA and little or not at all
to leaf cDNA were catalogued. A number of clones were identified as representing the seed storage protein
napin by cross-hybridization with an XhoI/SalI fragment of pN1 (Crouch et al., 1983, supra), a B. campestris
genomic clone as a source of an embryo-specific promoter.

Other seed-specific genes may also serve as useful sources of promoters. cDNA clones of cruciferin, the
other major seed storage protein of B. napus, have been identified (Simon et al., 1985, supra) and could be
used to screen a genomic library for promoters.

Without knowing their specific functions, yet other cDNA clones can be classified as to their level of
expression in seed tissues, their timing of expression (i.e., when post-anthesis they are expressed) and their
approximate representation (copy number) in the B. campestris genome. Clones fitting the criteria necessary
for expressing genes related to fatty acid synthesis or other seed functions can be used to screen a genomic
library for genomic clones which contain the 5' and 3' regulatory regions necessary for expression. The
non-coding regulatory regions can be manipulated to make a tissue-specific expression cassette in the
general manner described for the napin genes in previous examples.

One example of a cDNA clone is EA9. It is highly expressed in seeds and not leaves from B. campestris. It
represents a highly abundant mRNA as shown by cross-hybridization of seven other cDNAs from the library by
dot blot hybridization. Northern blot analysis of mRNA isolated from day 14 seed, and day 21 and 28
post-anthesis embryos using a 700 bp EcoRI fragment of EA9 as a probe shows that EA9 is highly expressed
at day 14 and expressed at a much lower level at day 21 and day 28. The restriction map of EA9 was determined
and the clone sequenced. Identification of a polyadenylation signal and of polyA tails at the 3'-end of EA9
confirms the orientation of the cDNA clone and the direction of transcription of the mRNA. The partial
sequence provided here for clone EA9 (Figure 3) can be used to synthesize a probe which will identify a unique
class of Brassica seed-specific promoters.

It is evident from the above results that transcription or expression can be obtained specifically in seeds, so
as to permit the modulation of phenotype or change in properties of a product of seed, particularly of the
embryo. It is found that one can use transcriptional initiation regions associated with the transcription of
sequences in seeds in conjunction with sequences other than the normal sequence to produce endogenous
or exogenous proteins or modulate the transcription or expression of nucleic acid sequences. In this manner,
seeds can be used to produce novel products, to provide for improved protein compositions, to modify the
distribution of fatty acid, and the like.

All publications and patent applications mentioned in this specification are indicative of the level of skill of
those skilled in the art to which this invention pertains. All publications and patent applications are herein
incorporated by reference to the same extent as if each individual publication or patent application was
specifically and individually indicated to be incorporated by reference.



Claims 

1. A seed comprising an expression cassette, said cassette comprising a seed specific transcriptional
initiation region, a sequence of interest under the transcriptional regulation of said initiation region, and a
transcriptional termination region, said expression cassette inserted into the genome of said seed at
other than the natural site for said transcriptional initiation region.

2. A seed according to Claim 1, wherein said sequence of interest is an open reading frame encoding an
endogenous protein or mutant thereof.

3. A seed according to Claim 1, wherein said sequence of interest is an open reading frame encoding an
exogenous protein.

4. A seed according to Claim 1, wherein said sequence of interest encodes a sequence complementary 
to a transcription product of said seed. 

5. A seed according to Claim 1, wherein said seed is of the Brassica family.

6. An expression construct comprising a seed specific transcriptional initiation region, a polylinker of
less than about 100 bp having at least two restriction sites for insertion of a DNA sequence to be under 
the transcriptional control of said initiation region, and a transcriptional termination region, the sequence 
of said polylinker being other than the sequence of the gene naturally under the transcriptional control of 
said initiation region. 

7. An expression construct according to Claim 6, wherein said transcription initiation region is from a
Brassica seed gene.

8. An expression construct according to Claim 7, wherein said gene is a napin gene. 

9. An expression cassette comprising a seed specific transcriptional initiation region, a DNA sequence 
of interest, other than the natural sequence joined to said initiation region, to be under the transcriptional 
control of said initiation region, and a transcriptional termination region. 

10. An expression cassette according to Claim 9, wherein said transcriptional initiation region is a
Brassica gene initiation region.

11. An expression cassette according to Claim 10, wherein said gene is a napin gene.

12. An expression cassette according to Claim 10, wherein said sequence of interest is a structural gene.

13. An expression cassette according to Claim 12, wherein said structural gene encodes a protein in the 
biosynthetic pathway for fatty acid production. 

14. An expression cassette according to Claim 13, wherein said protein is acyl carrier protein. 

15. A vector comprising an expression cassette according to any of Claims 9 to 14, a prokaryotic
replication system, and a marker for selection of transformed prokaryotes comprising said marker. 

16. A method for modifying the genotype of a seed comprising: 

growing a plant to seed production, wherein cells of said plant comprise an expression cassette
according to any of Claims 9 to 14,

whereby seeds are produced of modified genotype. 
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liAP-270 Linear IBNGTH - 702 

SfaNi 

Pokl Maeill. MboII ■ 

1 Ai^SATGTGTCtTGGTATSTATCTi^CGAGTAACWVAAGAGAAGATGCAA^ 

19 31 $5 ■ ■ 

36 

Alul Ahalll roJcI ■ EcoRV Hgal 

II I \. I 

72 AGAGCTTTT4SAAGCC34TCASGTGTGT6CTTT4ATCTXATT6ATATCATCCAIT4GCGTrGTTTAATCSC'* *• 
82 106 117 130 

. Ddel 

143 TCTTTAGATATGTTTCTGTTTCTTTCTCAGTGTCTeAATATCTGATAAGTGCAATGTGAGAAAGCCACACC 2 

169 

. • Taql 
Sapl Hinfl 

214 AAACCAAAAXATTCAAATCTTATAMTTTAATAATGTCGAATCACTCGGAGTTGCCACCTTCrGTGCCAAT. 2 

251 

Hin£I Mboll EcoRl ■ 

285 WTGCT6AATCTATCACACTAAAAAAAXCATTTCTTCAAGGTAAT6ACTTGTGGACTATC07TCTGAATTC 3 
292 310 35i 

Maell- 

356 TCATTAAGTTTTTATTTTTTGAAGTTTAAGTTTTTACCTTCTTTTTTGAAAA&TATCGTTCASAAGATGTC 4 

• <2* ■ ■ 

SphI

Scm Nlalll N8p(7524)I , 

EcoRII AluI SfaNI NlaIII MaeIII

II II I I I 

427 ACGCCAGGACATGAGCTACACATCACAXATTAGCATGCAGATGCGGACGATTTGTCACTCACTTCAAACAC 4 

432 440 454 

464 
Sflpi • PiraCl

Tthlllli Ndal N$p(7524)I Maell 

Avalll triairi Afliii upni Nlalii 

498 CTAAAAOAGCTTCTCTCTCACAa:ACACACACATATGCATGC^^ .5 
- *5Zfl - 532 539 648 5SS 563 

W8 - 531 539 550 

. 539 553 
543 . 5Sl 

. HphI Mnll 
\ I 

569 TCCATTCTCACCTATAAATTAGAOC^TCGGCTTCACTTTTTACTCAAACCJUU^ g 
COS 583 ^ ■ ■ 

Mboll Alul Mnll 

640 VACACAAAIfiGCGAACAAeCTCTTCCTCGTCTCGGCAACICTCGCCTTGTTCTTCCTTCTCAC 702 
o53 659 675 

SITE [NAP-270] : Q 



[illegible] promoter region of the λBnNa napin [illegible] reading frame is underlined.
[illegible] nucleotides in the sequence are
Lambda CGN1-2

NCG-166 Linear LENGTH - 4325 

XhoI

TaqI HindIII

AvaI AluI TaqI

II I I .1 . 

1 CTCCASGCAGTCACTAACATGAACTTTGACGAGGASCCCAMTXTGGfiAAGCTTATTTCTCTTTTCCAT . 6d 

2 • S2 66' 

3 50 

2 

SacI

HhaI XbaI AluI
70 ACCCTAATTGAQCCSTGCGCTCTATCTAGACCAATIAGAATTGATGGAGCTCTAAAGGTTGCTGGCTGT 138 

89 95 ' 119 ■ 

121 . 

NdeI NdeI
I I 
139 tttcttgotcatatgattaacttctaaacttgtgtataaatatcctct6aaagtgcttcttttggca?a 2 07 

150 2.06 
208 T6TAG6T7GG6CAAAAACGAGQAAGAITQCTTCTCAAT<ITGGAAGA6GAa!GAACAGCC'SAA6AAGAAAA 27f 



Sau3AI 
DdeI

277 TAXGAATAGGCAGTC WGCTACTCAATGGAXCTCAGTCTATAACGGTCGTCOTCCCATGAAACAGAGGT 345 

309 ■ ■ ' ' 

305 

EcoRV

346 AAAACATTTTTTGCATATACACTWGAAAGTTCCTCACTAACTGTGTAATCTTTTGGTAGATATCACTA 414 

• .-408 
KiACZI .

Hhai . 
H&eZIZ 
Ddel . 
BstEII 

Ball - Haelll Alul 

415 CAATGTCGGA6AGACAA3GGCTGMNCMlCATATACAmGGGAAATeAAGATG6GCKTTSATTAGCT 483 

439 . 469 . 481 . 

438 

439 
'439 

.440 
43S 



AluI HinfI
484 TGTAGCATCAGCAGCTAATCTCTGGGCTCTCATCATGGATGCTGGAMTGGATTCACTTCTCAAGTTTA 552 

498 . 535 . 



Mapi 

Hpall Hinfl 
I I 
553 TGAGTTGTCACCGGTCTTCCTACACAA6GTAATAATCAGTT6AAQCAATTAAGAATCAATTTGATTT6T $21 

564 606 
564 



DdeI

622 AGTAAACTAA6AA6AACTTACCTTATGTTTTCCCCGCAGGACTG6ATVAIGGAACAATG6GAAAAGAAC 690 
■ 629 



SacI

AluI AluI AluI

691 TACTATATXAGCTCCATAGCIGGTTC AGATAACGGGAGCTCTTT AGTTGTTATGTCAAAXGGTTAGTGT 759 

702 710 729 

731 



760 TTAGTGAATAATAAACTTATACCACAAAGTCTTCAT TGACTTATTTATATACTTGTTGTGAATTGCTAG ■ 826 



DdeI HinfI

829 GAACTACTTATTCTCAGCAGTCATACAAAGTGAOTGACTCATITCCGTTCAAGTGGATAAATAAGAAAT 897 
842 865 
TaqI

898 GSAAAGAAGATTTTC^TGTAXCCTCCATSACAACTGCTOGT/ATCSTTGGGGTGTIJI^^ ggg 
■ • - 961' . 

Sau3Al 
Bell 

967 ACTCTGGCTTCTCTaATCAGGTAGGTTTTTGTCTCTTATTGTCTGGTGCTTmTTTTC 1035 

981 • 
. 981 

Alul Raal 

1036 CTAATATGATAAACTCTGCGTTGTGAAAGGTGGTGGAGCTTGACITTTtSTACCCAAGCGATGGGATAC U04 

1074 1087 



1105 ATAGGAGGTGG6AGAAT6GGTATAGAATAAatCAAT6GCAGCAACT6CG6Aia«GCAGCTTT«^^ 1173 



SauS^l Aim 
;GATCAAGCAGCT7TC 
1155 . 1165 



HlftfX . • R«al 

1174 taagcataccaaagcgtaagatggtggatgaaactcaagagactctccgcaccaccgcctttcovagta^ ^ 

1215 . 1242 

1242 . 

Aluj Seu3Al : ■ p4,i • 

1243 CTCATSTCAAGGTTGGTTTCTTTAGCTTTGAACACAGATTTGGATCKTTT$TTTTOTTTCCAfATACT 1311 

1268 . 1285 ' 1311. ; 

Sdel 

Avail Alul ■ Hlnfl Raal 

III I I 

1312 TAGGACCTGAGAGCTTTTGGTTGATTTTTTTTTCAGGACAAAT6GGCGAAGAATCTGTACATTGCATCA ilS&O . 

"25, 1363- 1370 ' 

1319 

1381 ATATGCTATGGCAQGACAGTCTGCTGATACACACTTAAGCATCATGTGGAAAGCCAAAGACAATTGGAG 1449 ■ 
HinfI
DdeI

M . • 

1450 CflAGACTCA(5SGTC6TCATAATACCAATCAAAGACGTAAAACCAGACGCAACCTCTTTGGTTC5AATGTA 1515 

1456 ■ 
. 1454 

Rs&Z 

1519 ATGAAAGGGATGTSTCTTGGTATGTATGTACGAATAACAAAAGAGAAGATGGSATTAGTAGTAGAA^ 1587 

1548 

AluI EcoRV

I I 

158 B TTTGGGAGCTTTTTAAGCCCTTCAAOTGTGCITTTTATCTTATTGATATCATCCATTTGCGTTGTTTAA 1656 
1596 • . 1535 

XbaI DdeI

16S7 TGCGTCTraAGATATGTTCCTATATCTTTCTCASTGlCTGATAAGTGAAATGtfiASAAAACCATACCAA 1725 
1664 1J87 

HinfI

1726 ACCAAAATATTCAAATCTTXTTTTTAATAATGTT6AATCACTCGGAGTTGCCACCTTCTGTGCCAATTG .1794 

1761 

^^^^ ... BCORI 

1795 TMTGAATCTATCACACTAGAAAAAAACATTTCTTCAAQGTAATGACTTGTGGACTATGTTCTGAATTC 1863 
1800 ■ ■ " ■ ' ^ 1359. 

1864 TCATTAAGTTTTTATTITCTGAAGTTTAAGTTTTTACCTTCTGTTTTGAAAIATATCGTXCATAAGATG 1 932 



SphI 

BstNI AluI Sau3AI

1933 TCACGCCAGGACATGAGCTACACATCGCACATAGCATGCAGATCAGGACGATTTGTCACTCACTTCAAA 2001 

1940 1950 1973 

1971 
DdeI AluI HhaI NdeI NsiI Sau3AI

2002 CACC?WC»GCTTCICTCTCACACSCGCACACACAkTGCATGCMTATTTACACGTa^ J070 
2012 2028 2036 2042 2058- 



2044 



2071 ATCTCCATTCTCACCTATAAATTAGAWJCTCGGCTTCACTCTTTACSCAAA^^ 2139 

AluI

2140 eAACATACACAAATGJCSWiCAAGCTCTTWTCGTCTCGQCAACTCTCGCCTTGTTCTT 2208 

METAlaAsnLysLeuPheLeuValSerAlaThrLeuAlaLeuPhePheLeuLeuThr
2X64 

Jffl^ Maei 
Sail Mgpi 

2220 2241 2271 

2239 2260 

2240 ■• 2268 
• 2241 ,2263 

uj^A-r BindZII 
Hi'lf I Alul • • 

2327 
• 2325 

Mapl Avail 

HpAll Avail Alul TaqI 

LysGlnAlaMETGlnSerGlySerGlyProSerTrpThrLeuAspGlyGluPheAspPh
2364 2382 
HaftXZI SacI

ApaI HaeIII AluI

i I I M 

2416 GTGGAGAACCAACAACA6G6CCCGCA6CAGA6GCCACC6CTGCTCCACK:ASTQCTeCj^C'GA6CTCa^^ 2^84 
ValGluAsnGlnGlnGlnGlyProGlnGlnArgProProLeuLeuGlnGlnCysCysAsnGluLeuHis

243$ ■ 2449 2479 

2436 ..!■•'•.. .2481 • 

• BstNI ... Hiftf I . 

I - I I • 

2485 CACGAACAGCCACTTTGCGTTTGCCCAACCTTGAAAGGA6CATCCAAAGCCGTTAAACAAC&GATTC5A 25S3 
GlnGluGluProLeuCysValCysProThrLeuLysGlyAlaSerLysAlaValLysGlnGlnIleArg
2486 ■ , ■ 2548 

2S5i . 

2554 CAACAACAGGGACAACAAATGCAGGGACAGCAGATGCAGCAAGTGATTAGCCGTATCtACCAGACCGCT 2622 
GlnGlnGlnGlyGlnGlnMETGlnGlyGlnGlnMETGlnGlnValIleSerArgIleTyrGlnThrAla

Alul Bstkl 
I .1 
2623 ACGCACTTACCTAGAGCTTGCAACATCAGCCAAGTTAGCATTTGCCCCTTCCyWsAAGACCATGCCTGGO 2691 
ThrHisLeuProArgAlaCysAsnIleArgGlnValSerIleCysProPheGlnLysThrMETProG

Mspl 

. Bpall Xhol 
Raalll Taql 

Apax HiAfi Aval AceZ 

II I III 

2692 CCCGGCTTCTACTAGATTCCAAACGAATATCCTCGAfiAGTGTQTATACCACGGTGATATGAGTGTGGTT 2760 
ProGlyPheTyr

2694 2707 2724 2736 

2692 2725 

2694 2724 ' 

269-4 . . 

Hindi Rsal 
I ■ I . ■ , 

2761 G7TGA7GTATG7TAACACTACAIAGTCATGGTGTGT67TCCATAAATAA7GTACTAATGTAATAAGAAC 2829 

I 

2771 2ai3 

ACCI 

I ■ 

2830 TACTCCGTAGACGGTAATAAAAGAGAAGTTTTTTTTTTTACTCTTGCTACTTTCCTATAAAGTGATGAT 2898 
2838 
ScaI
RsaI

2899 TAACMCAGATACACCXAAAAGAmCAATTAATCTATATTCACAATGAASCAGTACTAGrCTAT 2967 

•. 29S4 
29154 

Sau3AT 

■..2968 CATGTCA6ATTTTCTTTTTCTAAATGTCTAAtTAAGCCTTaUU3»WftGTC»TGAT 3036 

■ . 3028 

Sau3Al SauSAI 
BamHl HiofI Bell 

I I I 

3037 ATeGGATCCAACAAAGACTCAAATCTGGTTTTGAlCAGATACTTCAAAACTATXTTTGTATTCATTAAA 3105 

3041 .3053. • 3069. ■ 

3041 3069 

HinfI

3106 TTATGCAAGTGTTCTTTTATTTGGTGAAGACTCTTTAGAAGCAAAGAACGACAAGCAGTAATAAAAAAA 3174 

3135 ' ' 

3175 ACAAAGTTCAGTTTTAAGATTTGTTATTGACITATTGTCATTTGAAAAATATAGIATGATATTAATATA 3243 
3244 GTTTTATTTATATAATGCTTGTCTATTCAAGAITTCAGAACATmTATGATACTGTCCACATATCCAA 3312 



NdeI

3313 TATATTAA6TTTCATTTCTGTTCAAACATAT6ATAAGATG6TCAAATGATTATGACTTTTGTTA7TTAC 3381 

3341 

'aql .Sau3Al • 

Alui Real 

33,82 CTGAACAAAAGAIAAGTGAGCTTCGAGTTTCTGAAGGGTACGTGAiCTTCATTTCTTQGCTAAAAGCGA 34 50 

3402 3421 
3405 • 3425 

3451 ATATGACATCACCTAGAGAAAGCCGATAATAGTAAACTCTGTTCTTGGTTTTTGGTTTAXTCAAACCGA -3519 
MspI

MspI DdeI HpaII
HpaII AluI NdeI HinfI

-in - I I ■• = I . - 

3520 ACC6GTAGCTGA67GTCAM3TCA6CAAACATC6CftAACCATAT(3TCMlTTGOTTAGATTCpC66TTTAA 35Si8 

3522 3528 3560 3576 

3522 3S2d 3581 

■ 3:5.81 

M«pl 
Bpall 

I 

35B9 GTTGTAAACCGGTATTTCATTTGGTGAAAACCCTAGAASCCAQCCANCCTTTTTAAiCTAATTTTIGCA 36.57 

3598 
3598 

HinfI

BiQClI 

Mel BstNI 

3656 AAC6AGAAGTCACCACACCTCTCCACTAAAACCCTGAACCTTACTSAGAGAA6CAGA&KCAMKAAAGAA 3726 

3702 3718 . 

3715 
• 3714 

3727 CAAATAAAACCCGAAGATGAGACCACCACGTGCGGCGGGACQTTCAGGQGACGGGGAGGAAGfASAATGR 3795 



Avail • 

Alul Avail 
II .1 
3796 CGGCGG 5KK7T7GGTGGC6GCGGCGGAC6TTTT0GT6GCGGCG6TGGAC6T7TTQGT66C6GC6G7GGA 3864 

3804 3863 
3801 

. EcoRV Avail Ddel 

I • I • .1 

3865 CCTTTGGTGGTGGA7ATCGTGACGAAGGACCTCCCAGTGAAG7CATTGGTTCGTTTAqTC77TTCTTAG ,3933 

3880 3692 3930 
HindIII

' ' III 
3 934 TCGAATCTTATTCTT6CTCTGCTCGTTGTTTTACCGAXAAAGCTTAAGACTTTATTGATAAAGTTCTCA .4002 

3937 - 3976 

3935 . 3974 

Alul Xnml ainfl d^^I 

I II 
4 5C3 GCTTIGAATGTGAXTGAACTGTTTCCTGCTTATTAGTeTTCCrroSTTT'rfiASMGAATCACTGTCTTA 4071 

«04 4023 4059 4069 . 

HinfI

4072 GCACTTTTGTTAGATTCATCTTTGTGTTTAAGTTAAAAGGTAGAAACTTTGTGACTTGTCTCC6TTAT6 4140 

4085 

Hindi 

I. 

4141. ACAAGGTTAACTTTGTTGGTTATAACAGAAGMGCGACCTTTCTCCATGCTTGTCAGGiSTGATGCTGTG 4209 
. 4146 

AvetZI Alul Dd6l 5au3AI 
I II I 

4210 GACCAAGCTCTCTCAGGCGAAGATCCCTTACTTCAATGCCCCAATCTAClTeGAAAACAAGACACASAT 4278 
4210 4217 4222 4231 

Taqi 
Sill 
PstI 

Hlndlll BlncIZ 
8au3AI AluZ AOOI EeoRi . 

4279 TGGGAAAGTTGATGAGATCCAAGCTTGGGCTGCA66TCGACSAATTC 4325. 

'1 4294 4302 4316 4321 

4300 4314 
4313 . 
4315 
- 4316 
EA9-14 Linear LENGTH - 416



ScrFI 
Haalll 
■ Hpall 

C.UII - Mnll Hpiij 

1 CCAACCCSGTTAGAATTffCAXtAeeSGiMS&i^aAaTTGGTt'ffGi^ 

7 . fii • .5:9.- 

9 . 

7 



71 



Xftatil ■ 

Mmel . {jbo'l 

Mboi iaeill 
Dpnl <3<JiII 
Biml Bell Miaixi . Cfifet. £ij>fti 

72 CSGAGAATGCACMCTCATCTlGATCACGGGSTAXCTGCGGTTGGATACGSCCGAicTAAAAACGGATT^ If 
.82. S3 102 lie 126 

. 95 I2i3 
93 • .:122 
93 124 

120 



143 



Xhell 
ScaZ Hiaiv 

Hbol 
Rsai Dpnl 
BlnZ BunHI 
II II 



Nlaiv 
Nlaiii. 
rini 
XcoRI AviZI 
Bini HnlZ Aiuz 
Mil 



144 149 157 .163 169 

145 151 159 169 

149 167 

14b 151 16? 

149 170 



MaaZZ 

I. I 



186 



190 



Klaiii 
Mbpl 
OpaZ 

.1 I 



202 
200 
i9S 



BinI SCO 

I I 

213 
208 213 



214 



Nlazzz Kiaii 
l«ml 

I I I 



25a 

249 259 



.TCGGTTC 2«-'-, 



K«p{7524)I 
Kiaizi • 



Hindziz 

. Hp«ii Aiuz aihiz 

' ' ' I ■ I 

285 AATATCCGGTTAAGCTTTACaU^iaaaXGTGTGTiSTTGGTTXlAATTTAAGACTiSTGT'rGCAtGTAATTTGT 35 i 
291 299 . 3315 346 

297 348 
SfaNI

I ...... 

365 



Partial sequence of a [illegible] EA9 [illegible]
to identify a genomic [illegible] promoter. [illegible] polyadenylation signal AATAAA is underlined [illegible] polyA [illegible].
The stop codon of the [illegible] open [illegible]